Nuremberg
Robot Talk Episode 116 – Evolved behaviour for robot teams, with Tanja Kaiser
Claire chatted to Tanja Katharina Kaiser from the University of Technology Nuremberg about how applying evolutionary principles can help robot teams make better decisions. Tanja Katharina Kaiser is a senior researcher heading the Multi-Robot Systems Satellite Lab at the University of Technology Nuremberg (UTN) in Germany. She and her team focus on the development of adaptive multi-robot systems to solve complex real-world tasks using artificial intelligence. Tanja received her doctorate in robotics from the University of Lübeck in Germany in 2022. Before joining UTN, she held postdoctoral research positions at the Technical University of Dresden and the University of Konstanz.
Hi-ALPS -- An Experimental Robustness Quantification of Six LiDAR-based Object Detection Systems for Autonomous Driving
Arzberger, Alexandra, Kolagari, Ramin Tavakoli
Light Detection and Ranging (LiDAR) is an essential sensor technology for autonomous driving, as it can capture high-resolution 3D data. Because 3D object detection systems (OD) interpret such point cloud data, they play a key role in the driving decisions of autonomous vehicles. Consequently, such 3D OD must be robust against all types of perturbations and must therefore be tested extensively. One approach is the use of adversarial examples: small, sometimes sophisticated perturbations of the input data that change, i.e., falsify, the prediction of the OD. These perturbations are carefully designed to exploit the weaknesses of the OD. However, adversarial examples alone cannot quantify the robustness of an OD in general: if the OD is vulnerable to a given attack, it is unclear whether this is due to a weakness of the OD itself or whether the attack algorithm simply produces particularly strong adversarial examples. The contribution of this work is Hi-ALPS -- the Hierarchical Adversarial-example-based LiDAR Perturbation Level System, in which higher perturbation levels require higher robustness of the OD to withstand them. The Hi-ALPS levels successively implement a heuristic followed by established adversarial example approaches. In a series of comprehensive experiments using Hi-ALPS, we quantify the robustness of six state-of-the-art 3D OD under different types of perturbations. The results show that none of the OD is robust against all Hi-ALPS levels; an important factor for the ranking is that human observers can still correctly recognize the perturbed objects, as the respective perturbations are small. To increase the robustness of the OD, we discuss the applicability of state-of-the-art countermeasures. In addition, we derive further suggestions for countermeasures based on our experimental results.
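To make the level idea concrete, here is a minimal, purely illustrative sketch (not the Hi-ALPS implementation): a point cloud is perturbed with increasingly strong operations, and we record the first level at which a detector's prediction changes. The perturbation parameters and the detector interface are assumptions made up for this example.

```python
# Purely illustrative sketch (not the Hi-ALPS implementation): perturb a point
# cloud with increasingly strong operations and record the first "level" at
# which a detector's prediction changes. Parameters and the detector interface
# are made up for this example.
import numpy as np

def jitter(points, sigma):
    """Add Gaussian noise to the xyz coordinates of an (N, 3) point cloud."""
    return points + np.random.normal(0.0, sigma, size=points.shape)

def drop_points(points, ratio):
    """Randomly remove a fraction of the points."""
    keep = np.random.random(len(points)) >= ratio
    return points[keep]

# Perturbation levels of increasing strength; the concrete parameters are
# invented and only mirror the idea of a hierarchical level system.
LEVELS = [
    lambda p: p,                                          # level 0: clean input
    lambda p: jitter(p, sigma=0.01),                      # level 1: slight jitter
    lambda p: drop_points(p, ratio=0.1),                  # level 2: point removal
    lambda p: drop_points(jitter(p, 0.03), ratio=0.3),    # level 3: combined
]

def first_failing_level(detector, points, reference_label):
    """Return the first level at which the detector's prediction changes."""
    for level, perturb in enumerate(LEVELS):
        if detector(perturb(points)) != reference_label:
            return level
    return None  # robust against all tested levels
```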
Refined Policy Distillation: From VLA Generalists to RL Experts
Jülg, Tobias, Burgard, Wolfram, Walter, Florian
Recent generalist Vision-Language-Action Models (VLAs) can perform a variety of tasks on real robots with remarkable generalization capabilities. However, reported success rates are often not on par with those of expert policies. Moreover, VLAs usually do not work out of the box and often must be fine-tuned as they are sensitive to setup changes. In this work, we present Refined Policy Distillation (RPD), an RL-based policy refinement method that enables the distillation of large generalist models into small, high-performing expert policies. The student policy is guided during the RL exploration by actions of a teacher VLA for increased sample efficiency and faster convergence. Different from previous work that focuses on applying VLAs to real-world experiments, we create fine-tuned versions of Octo and OpenVLA for ManiSkill2 to evaluate RPD in simulation. As our results for different manipulation tasks demonstrate, RPD enables the RL agent to learn expert policies that surpass the teacher's performance in both dense and sparse reward settings. Our approach is even robust to changes in the camera perspective and can generalize to task variations that the underlying VLA cannot solve.
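The core mechanism, distilling a frozen generalist VLA into an RL student by nudging the actor toward the teacher's actions, can be sketched roughly as follows. The `teacher_vla` callable, the `rl_actor_loss` term, and the choice of an MSE distillation penalty are assumptions for illustration, not the paper's exact formulation.

```python
# Rough sketch of teacher-guided policy refinement under assumed interfaces:
# the frozen generalist VLA proposes an action, and a distillation term keeps
# the RL student's action close to it. `teacher_vla`, `student`, and
# `rl_actor_loss` are placeholders; MSE is an illustrative choice of penalty.
import torch
import torch.nn as nn

def refined_policy_loss(student, teacher_vla, obs, rl_actor_loss, beta=0.1):
    """Combine an RL actor loss with a distillation term toward the teacher."""
    with torch.no_grad():
        teacher_action = teacher_vla(obs)   # action proposed by the generalist
    student_action = student(obs)           # action of the expert policy in training
    distill = nn.functional.mse_loss(student_action, teacher_action)
    return rl_actor_loss + beta * distill
```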
LiDAR Registration with Visual Foundation Models
Vödisch, Niclas, Cioffi, Giovanni, Cannici, Marco, Burgard, Wolfram, Scaramuzza, Davide
LiDAR registration is a fundamental task in robotic mapping and localization. A critical component of aligning two point clouds is identifying robust point correspondences using point descriptors. This step becomes particularly challenging in scenarios involving domain shifts, seasonal changes, and variations in point cloud structures. In this paper, we address these problems by proposing to use DINOv2 features, obtained from surround-view images, as point descriptors. We demonstrate that coupling these descriptors with traditional registration algorithms, such as RANSAC or ICP, facilitates robust 6DoF alignment of LiDAR scans with 3D maps, even when the map was recorded more than a year before. Although conceptually straightforward, our method substantially outperforms more complex baseline techniques. In contrast to previous learning-based point descriptors, our method does not require domain-specific retraining and is agnostic to the point cloud structure, effectively handling both sparse LiDAR scans and dense 3D maps. We show that leveraging the additional camera data enables our method to outperform the best baseline by +24.8 and +17.3 registration recall on the NCLT and Oxford RobotCar datasets. We publicly release the registration benchmark and the code of our work at https://vfm-registration.cs.uni-freiburg.de.
Aligning two point clouds to compute their relative 3D transformation is a critical task in numerous robotic applications, including LiDAR odometry [30], loop closure registration [2], and map-based localization [19]. In this work, we specifically discuss map-based localization, which not only generalizes the other aforementioned tasks but is also critical for improving the efficiency and autonomy of mobile robots in environments where pre-existing map data is available.
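A conceptual sketch of the descriptor step is given below, with assumed interfaces throughout: LiDAR points are projected into a camera image, a visual-foundation-model feature map is sampled at the projected pixels to obtain per-point descriptors, and mutual nearest-neighbour matching yields correspondences that a standard RANSAC or ICP backend could then consume. The camera intrinsics, extrinsics, and feature map are placeholders, not the paper's actual pipeline.

```python
# Conceptual sketch with assumed interfaces: project LiDAR points into an
# image, sample a feature map (e.g. from a visual foundation model) at the
# projected pixels to get per-point descriptors, and match them by mutual
# nearest neighbours. The correspondences would then go to RANSAC/ICP.
import numpy as np
from scipy.spatial import cKDTree

def point_descriptors(points_lidar, feat_map, K, T_cam_lidar):
    """points_lidar: (N, 3); feat_map: (H, W, D) image features; K: 3x3 intrinsics;
    T_cam_lidar: 4x4 extrinsics. Returns indices and descriptors of visible points."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0.1                    # keep points in front of the camera
    uv = (K @ pts_cam[in_front].T).T
    uv = (uv[:, :2] / uv[:, 2:3]).round().astype(int)  # pixel coordinates
    h, w, _ = feat_map.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    idx = np.flatnonzero(in_front)[valid]
    return idx, feat_map[uv[valid, 1], uv[valid, 0]]   # (M,), (M, D)

def mutual_nn_matches(desc_a, desc_b):
    """Mutual nearest-neighbour correspondences between two descriptor sets."""
    ab = cKDTree(desc_b).query(desc_a)[1]
    ba = cKDTree(desc_a).query(desc_b)[1]
    return np.array([(i, j) for i, j in enumerate(ab) if ba[j] == i])
```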
(Neural-Symbolic) Machine Learning for Inconsistency Measurement
We present machine-learning-based approaches for determining the \emph{degree} of inconsistency -- which is a numerical value -- of propositional logic knowledge bases. Specifically, we present regression- and neural-based models that learn to predict the values that the inconsistency measures $\mathcal{I}_{\text{MI}}$ and $\mathcal{I}_{\text{at}}$ would assign to propositional logic knowledge bases. Our main motivation is that computing these values exactly is computationally hard. As an important addition, we use specific postulates, that is, properties, of the underlying inconsistency measures to infer symbolic rules, which we combine with the learning-based models in the form of constraints. We perform various experiments and show that a) predicting the degree values is feasible in many situations, and b) including the symbolic constraints deduced from the rationality postulates increases the prediction quality.
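As a rough illustration of combining a learned predictor with postulate-derived constraints, the sketch below trains a small regressor on hand-crafted knowledge-base features and adds a penalty for violating the monotonicity postulate, i.e. adding a formula to a knowledge base should not decrease its inconsistency value. The featurization, network, and training setup are toy assumptions, not the models used in the paper.

```python
# Toy sketch: a small regressor predicts an inconsistency degree from
# hand-crafted knowledge-base features, and a penalty enforces the monotonicity
# postulate (adding a formula must not decrease the predicted inconsistency).
# Featurization, network, and data handling are assumptions, not the paper's models.
import torch
import torch.nn as nn

def kb_features(kb):
    """kb: list of clause strings, e.g. ["a", "!a | b"]. Toy featurization."""
    text = " ".join(kb)
    return torch.tensor([len(kb), text.count("!"), text.count("|"),
                         len(set(text.split()))], dtype=torch.float32)

model = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()

def train_step(kb, target_value, extra_formula, lam=0.1):
    x = kb_features(kb)
    x_aug = kb_features(kb + [extra_formula])        # the same KB plus one formula
    pred, pred_aug = model(x), model(x_aug)
    # Postulate-derived constraint: penalize predictions where the augmented KB
    # is scored as *less* inconsistent than the original one.
    penalty = torch.relu(pred - pred_aug).mean()
    loss = mse(pred, torch.tensor([target_value])) + lam * penalty
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```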
The Role of Generative AI in Software Student CollaborAItion
Kiesler, Natalie, Smith, Jacqueline, Leinonen, Juho, Fox, Armando, MacNeil, Stephen, Ihantola, Petri
Collaboration is a crucial part of computing education. The increase in AI capabilities over the last couple of years is bound to profoundly affect all aspects of systems and software engineering, including collaboration. In this position paper, we consider a scenario where AI agents would be able to take on any role in collaborative processes in computing education. We outline these roles, the activities and group dynamics that software development currently include, and discuss if and in what way AI could facilitate these roles and activities. The goal of our work is to envision and critically examine potential futures. We present scenarios suggesting how AI …
Khan [28] has proposed an inspiring vision of how AI could help realize personalized individual tutors for every learner. Complementing this, an expert panel from 2020 [49] draws a scenario where "AI supports orchestration of the multiple types of activities, learning partners, and interaction patterns that can enrich a classroom". We believe the possibilities are even broader, and to help think about them, we propose a thought experiment that not only accommodates emerging practices and visions but also suggests new use cases in education that (to the best of our knowledge) have not yet been explored.
Digital Operating Mode Classification of Real-World Amateur Radio Transmissions
Bundscherer, Maximilian, Schmitt, Thomas H., Baumann, Ilja, Bocklet, Tobias
This study presents an ML approach for classifying digital radio operating modes, evaluated on real-world transmissions. We generated 98 different parameterized radio signals from 17 digital operating modes, transmitted each of them on the 70 cm (UHF) amateur radio band, and recorded our transmissions with two different SDR receiver architectures. Three lightweight ML models were trained exclusively on spectrograms of a limited set of non-transmitted signals with random characters as payloads. This training involved an online data augmentation pipeline to simulate various radio channel impairments. Our best model, EfficientNetB0, achieved an accuracy of 93.80% across the 17 operating modes and 85.47% across all 98 parameterized radio signals, evaluated on our real-world transmissions with Wikipedia articles as payloads. Furthermore, we analyzed the impact of varying signal durations and the number of FFT bins on classification, assessed the effectiveness of our simulated channel impairments, and tested our models across multiple simulated SNRs.
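The pipeline described above can be pictured with a short sketch (assumed, not the authors' code): recorded IQ samples are converted to a log-magnitude spectrogram and classified with an EfficientNetB0. The sampling rate, FFT size, and normalization are illustrative choices.

```python
# Assumed pipeline sketch (not the authors' code): convert recorded IQ samples
# to a log-magnitude spectrogram and classify it with EfficientNetB0. Sampling
# rate, FFT size, and normalization are illustrative choices.
import numpy as np
import torch
from scipy.signal import spectrogram
from torchvision.models import efficientnet_b0

NUM_MODES = 17  # digital operating modes

def iq_to_spectrogram(iq: np.ndarray, fs: float, nfft: int = 256) -> torch.Tensor:
    """Complex IQ samples -> normalized log-magnitude spectrogram of shape (1, F, T)."""
    _, _, sxx = spectrogram(iq, fs=fs, nperseg=nfft, return_onesided=False)
    sxx = 10 * np.log10(np.abs(sxx) + 1e-12)
    sxx = (sxx - sxx.mean()) / (sxx.std() + 1e-6)
    return torch.from_numpy(sxx).float().unsqueeze(0)

model = efficientnet_b0(num_classes=NUM_MODES)
model.eval()

def classify(iq: np.ndarray, fs: float) -> int:
    """Return the index of the predicted operating mode for one recording."""
    x = iq_to_spectrogram(iq, fs).repeat(3, 1, 1).unsqueeze(0)  # (1, 3, F, T)
    with torch.no_grad():
        return int(model(x).argmax(dim=1))
```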
Radio Foundation Models: Pre-training Transformers for 5G-based Indoor Localization
Ott, Jonathan, Pirkl, Jonas, Stahlke, Maximilian, Feigl, Tobias, Mutschler, Christopher
Artificial Intelligence (AI)-based radio fingerprinting (FP) outperforms classic localization methods in propagation environments with strong multipath effects. However, the model and data orchestration of FP are time-consuming and costly, as they require many reference positions and extensive measurement campaigns for each environment. By contrast, modern unsupervised and self-supervised learning schemes require less reference data for localization, but either their accuracy is low or they require additional sensor information, rendering them impractical. In this paper, we propose a self-supervised learning framework that pre-trains a general transformer (TF) neural network on 5G channel measurements that we collect on the fly without expensive equipment. Our novel pretext task randomly masks and drops input information and learns to reconstruct it. In this way, the model implicitly learns the spatiotemporal patterns and information of the propagation environment that enable FP-based localization. Most interestingly, when we optimize this pre-trained model for localization in a given environment, it achieves the accuracy of state-of-the-art methods but requires ten times less reference data and significantly reduces the time from training to operation.
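A hedged sketch of such a masking pretext task is shown below; the feature dimension, masking ratio, and model sizes are assumptions, and the reconstruction loss over masked positions stands in for whatever objective the paper actually uses.

```python
# Hedged sketch of a masking-style pretext task: randomly mask parts of a 5G
# channel-measurement sequence and train a transformer encoder to reconstruct
# the masked values. Feature dimension, masking ratio, and model sizes are
# assumptions; the loss stands in for the paper's actual objective.
import torch
import torch.nn as nn

class MaskedChannelModel(nn.Module):
    def __init__(self, feat_dim=64, d_model=128, nhead=4, layers=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(d_model, feat_dim)          # reconstruction head
        self.mask_token = nn.Parameter(torch.zeros(d_model))

    def forward(self, x, mask):
        # x: (B, T, feat_dim) measurement sequence; mask: (B, T) bool, True = masked
        h = self.proj(x)
        h = torch.where(mask.unsqueeze(-1), self.mask_token, h)
        return self.head(self.encoder(h))

def pretrain_step(model, opt, x, mask_ratio=0.3):
    """One self-supervised step: mask random positions and reconstruct them."""
    mask = torch.rand(x.shape[:2]) < mask_ratio
    recon = model(x, mask)
    loss = ((recon - x) ** 2)[mask].mean()   # loss only on the masked positions
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```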